nested model
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
The Morgan-Pitman Test of Equality of Variances and its Application to Machine Learning Model Evaluation and Selection
Arratia, Argimiro, Cabaña, Alejandra, Mordecki, Ernesto, Rovira-Parra, Gerard
Model selection in non-linear models often prioritizes performance metrics over statistical tests, limiting the ability to account for sampling variability. We propose the use of a statistical test to assess the equality of variances in forecasting errors. The test builds upon the classic Morgan-Pitman approach, incorporating enhancements to ensure robustness against heavy-tailed distributions and high-variance outliers, plus a strategy to make residuals from machine learning models statistically independent. Through a series of simulations and real-world data applications, we demonstrate the test's effectiveness and practical utility, offering a reliable tool for model evaluation and selection in diverse contexts.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Spain (0.05)
- South America > Uruguay > Montevideo > Montevideo (0.04)
- Europe > Hungary > Budapest > Budapest (0.04)
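The classic Morgan-Pitman test the paper above builds on has a simple form: for paired samples, equality of variances is equivalent to zero correlation between their sum and difference. A minimal sketch of the classic (non-robustified) test, applied to the paired forecasting-error setting the abstract describes:

```python
# Minimal sketch of the classic Morgan-Pitman test (not the paper's
# robustified variant): for paired forecast errors e1, e2,
# cov(e1 + e2, e1 - e2) = var(e1) - var(e2), so H0: equal variances
# reduces to H0: zero correlation between the sum and the difference.
import numpy as np
from scipy.stats import pearsonr

def morgan_pitman(e1, e2):
    """Return (correlation, p-value) for H0: var(e1) == var(e2)."""
    e1, e2 = np.asarray(e1), np.asarray(e2)
    r, p = pearsonr(e1 + e2, e1 - e2)
    return r, p

rng = np.random.default_rng(0)
e_a = rng.normal(0.0, 1.0, size=500)   # model A's forecast errors
e_b = rng.normal(0.0, 2.0, size=500)   # model B's errors, larger variance
r, p = morgan_pitman(e_a, e_b)
print(f"r = {r:.3f}, p = {p:.4f}")     # small p-value: variances differ
```

Note the classic test assumes independent, roughly Gaussian pairs; the paper's contribution is precisely to relax those assumptions for machine-learning residuals.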
Savage-Dickey density ratio estimation with normalizing flows for Bayesian model comparison
Lin, Kiyam, Polanska, Alicja, Piras, Davide, Mancini, Alessio Spurio, McEwen, Jason D.
A core motivation of science is to evaluate which scientific model best explains observed data. Bayesian model comparison provides a principled statistical approach to comparing scientific models and has found widespread application within cosmology and astrophysics. Calculating the Bayesian evidence is computationally challenging, especially as we continue to explore increasingly complex models. The Savage-Dickey density ratio (SDDR) provides a method to calculate the Bayes factor (evidence ratio) between two nested models using only posterior samples from the super model. The SDDR requires the calculation of a normalised marginal distribution over the extra parameters of the super model, which has typically been performed using classical density estimators, such as histograms. Classical density estimators, however, can struggle to scale to high-dimensional settings. We introduce a neural SDDR approach using normalizing flows that can scale to settings where the super model contains a large number of extra parameters. We demonstrate the effectiveness of this neural SDDR methodology applied to both toy and realistic cosmological examples. For a field-level inference setting, we show that Bayes factors computed for a Bayesian hierarchical model (BHM) and simulation-based inference (SBI) approach are consistent, providing further validation that SBI extracts as much cosmological information from the field as the BHM approach. The SDDR estimator with normalizing flows is implemented in the open-source harmonic Python package.
- Europe > Switzerland (0.04)
- Europe > United Kingdom (0.04)
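The SDDR itself is easy to state: when M0 fixes an extra parameter of the super model M1 at theta0, the Bayes factor is the marginal posterior density at theta0 divided by the prior density there. A toy sketch using a classical Gaussian-KDE density estimator (the kind of baseline the paper replaces with normalizing flows); the conjugate-Gaussian setup is illustrative only, not from the paper:

```python
# Hedged sketch of the Savage-Dickey density ratio with a classical
# density estimator. For nested models M0: theta = theta0 inside M1,
#   BF_01 = p(theta0 | data, M1) / pi(theta0 | M1),
# i.e. posterior density over prior density at the nested value.
import numpy as np
from scipy.stats import gaussian_kde, norm

theta0 = 0.0                        # value fixed by the nested model M0
rng = np.random.default_rng(1)

# Toy conjugate setup: y_i ~ N(theta, 1) with prior theta ~ N(0, 1),
# so the posterior is Gaussian and the exact answer is available.
y = rng.normal(0.2, 1.0, size=20)
n = len(y)
post_var = 1.0 / (n + 1)            # conjugate posterior variance
post_mean = post_var * y.sum()      # conjugate posterior mean

# "Posterior samples from the super model", all the SDDR needs.
samples = rng.normal(post_mean, np.sqrt(post_var), size=5000)

kde_bf = gaussian_kde(samples)(theta0)[0] / norm.pdf(theta0, 0.0, 1.0)
exact_bf = norm.pdf(theta0, post_mean, np.sqrt(post_var)) / norm.pdf(theta0, 0.0, 1.0)
print(f"KDE SDDR: {kde_bf:.3f}   exact: {exact_bf:.3f}")
```

In one dimension the KDE estimate tracks the exact ratio closely; the paper's point is that such classical estimators degrade as the number of extra parameters grows, which is where the flow-based estimator comes in.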
Masked Generative Nested Transformers with Decode Time Scaling
Goyal, Sahil, Tula, Debapriya, Jain, Gagan, Shenoy, Pradeep, Jain, Prateek, Paul, Sujoy
Recent advances in visual generation produce content of exceptional quality, but most methods share a fundamental problem: a bottleneck in inference-time computational efficiency. Most of these algorithms make multiple passes over a transformer model to generate tokens or denoise inputs, yet the model size is kept constant across all iterations, which is computationally expensive. In this work, we address this issue through two key ideas: (a) not all parts of the generation process need equal compute, so we design a decode-time model scaling schedule to use compute effectively, and (b) some of the computation can be cached and reused. Combining these two ideas leads to using smaller models to process more tokens while larger models process fewer tokens. These different-sized models do not increase the parameter count, as they share parameters. We rigorously experiment with ImageNet 256$\times$256, UCF101, and Kinetics600 to showcase the efficacy of the proposed method for image/video generation and frame prediction. Our experiments show that with almost $3\times$ less compute than the baseline, our model obtains competitive performance.
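To make idea (a) concrete, here is an illustrative-only sketch of a decode-time model scaling schedule; the per-step width fractions and the quadratic cost model are assumptions for illustration, not the paper's actual schedule:

```python
# Illustrative sketch of decode-time model scaling: different
# generation steps run different-sized nested sub-models of one
# shared-parameter transformer, instead of the full model every step.
def step_cost(width_frac, tokens, full_cost=1.0):
    # Assumed cost model: compute scales roughly with the squared
    # fraction of the transformer width kept active, times tokens.
    return full_cost * width_frac ** 2 * tokens

steps, tokens = 16, 256
baseline = sum(step_cost(1.0, tokens) for _ in range(steps))

# Hypothetical schedule: cheap sub-models early and late, the full
# model only for the mid-generation steps that need it most.
schedule = [0.25] * 4 + [0.5] * 4 + [1.0] * 4 + [0.5] * 4
scaled = sum(step_cost(w, tokens) for w in schedule)
print(f"compute ratio vs. constant-size baseline: {baseline / scaled:.2f}x")
```

Under these made-up numbers the schedule spends about 2.6x less compute than running the full model at every step, which is the flavor of saving the abstract reports.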
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
A Nested Model for AI Design and Validation
Dubey, Akshat, Yang, Zewen, Hattab, Georges
The growing AI field faces challenges of trust, transparency, fairness, and discrimination. Despite the need for new regulations, a mismatch between regulatory science and AI prevents a consistent framework. A five-layer nested model for AI design and validation aims to address these issues and streamline AI application design and validation, improving fairness, trust, and AI adoption. This model aligns with regulations, addresses AI practitioners' daily challenges, and offers prescriptive guidance for determining appropriate evaluation approaches by identifying unique validity threats. We make three recommendations motivated by this model: (1) authors should distinguish between layers when claiming contributions, to clarify where the contribution is made and avoid confusion; (2) authors should explicitly state upstream assumptions so that the context and limitations of their AI system are clearly understood; and (3) AI venues should promote thorough testing and validation of AI systems and their compliance with regulatory requirements.
- Europe > Germany (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Law > Statutes (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Jain, Gagan, Hegde, Nidhi, Kusupati, Aditya, Nagrani, Arsha, Buch, Shyamal, Jain, Prateek, Arnab, Anurag, Paul, Sujoy
The visual medium (images and videos) naturally contains a large amount of informational redundancy, offering a great opportunity for more efficient processing. While Vision Transformer (ViT) based models scale effectively to large data regimes, they fail to capitalize on this inherent redundancy, leading to higher computational costs. Mixture of Experts (MoE) networks demonstrate scalability while maintaining the same inference-time costs, but they come with a larger parameter footprint. We present Mixture of Nested Experts (MoNE), which utilizes a nested structure for experts, wherein individual experts fall on an increasing compute-accuracy curve. Given a compute budget, MoNE learns to dynamically choose tokens in priority order, so that redundant tokens are processed through cheaper nested experts. Using this framework, we achieve performance equivalent to the baseline models while reducing inference-time compute by more than two-fold. We validate our approach on standard image and video datasets: ImageNet-21K, Kinetics400, and Something-Something-v2. We further highlight MoNE's adaptability by showcasing its ability to maintain strong performance across different inference-time compute budgets on videos, using only a single trained model.
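The priority-order routing can be sketched as follows; the importance scores, capacity fractions, and number of nested experts here are hypothetical stand-ins, not MoNE's actual learned router:

```python
# Illustrative sketch of priority-order token routing to nested
# experts: the highest-scoring tokens go to the largest (most
# expensive) expert, and redundant tokens fall through to cheaper
# nested sub-models that share the same parameters.
import numpy as np

def route_tokens(scores, capacities):
    """capacities: fraction of tokens per expert, largest expert first."""
    order = np.argsort(scores)[::-1]           # tokens in priority order
    assignment = np.empty(len(scores), dtype=int)
    start = 0
    for expert_id, frac in enumerate(capacities):
        k = int(round(frac * len(scores)))
        assignment[order[start:start + k]] = expert_id
        start += k
    assignment[order[start:]] = len(capacities) - 1  # leftovers -> cheapest
    return assignment

rng = np.random.default_rng(0)
scores = rng.random(12)                        # stand-in importance scores
assign = route_tokens(scores, capacities=[0.25, 0.25, 0.5])
print(assign)  # expert 0 = full model, 2 = cheapest nested expert
```

Changing the capacity fractions is what lets a single trained model serve different inference-time compute budgets, as the abstract highlights.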
Comparing Model Evaluation Techniques Part 3: Regression Models - DataScienceCentral.com
In this post, I'll take a look at how you can compare regression models. Comparing regression models is perhaps one of the trickiest tasks in the "comparing models" arena, because there are literally dozens of statistics you can calculate to compare them, and that list isn't exhaustive: there are many other tools, tests, and plots at your disposal. Rather than discuss the statistics in detail, I chose to focus this post on a few of the most popular regression model evaluation techniques and when you might (or might not) want to use them. The techniques listed below tend to be on the "easier to use and understand" end of the spectrum, so if you're new to model comparison they're a good place to start. The first question you should be asking is: how well do I know my data?
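As a concrete starting point, here's a small example computing three of the most popular comparison statistics (RMSE, MAE, and R²) for two hypothetical regression models on the same held-out data:

```python
# Comparing two regression models with three common statistics.
# The data and the two "models" are synthetic, for illustration only.
import numpy as np

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

def r2(y, yhat):
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 * x + rng.normal(0, 1.0, 200)   # true relationship plus noise

pred_a = 2.0 * x          # well-specified model
pred_b = 1.8 * x + 1.0    # slightly biased model
for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(f"model {name}: RMSE={rmse(y, pred):.3f}  "
          f"MAE={mae(y, pred):.3f}  R2={r2(y, pred):.3f}")
```

Here all three statistics agree that model A is better, but on real data they can disagree (RMSE punishes large errors more than MAE does), which is exactly why knowing your data matters.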
Nested Model Averaging on Solution Path for High-dimensional Linear Regression
We study the nested model averaging method on the solution path for a high-dimensional linear regression problem. In particular, we propose to combine model averaging with regularized estimators (e.g., lasso and SLOPE) on the solution path for high-dimensional linear regression. In simulation studies, we first conduct a systematic investigation of the impact of predictor ordering on the behavior of nested model averaging, then show that nested model averaging with lasso and SLOPE compares favorably with other competing methods, including the infeasible lasso and SLOPE with the tuning parameter optimally selected. A real data analysis on predicting per capita violent crime in the United States shows outstanding performance of nested model averaging with lasso.
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
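The basic recipe behind averaging on a solution path can be sketched with scikit-learn's `lasso_path`: each point on the path yields a model (with nested supports along the path), and their predictions are combined. The uniform weights below are a placeholder assumption, not the paper's data-driven weighting scheme:

```python
# Hedged sketch of model averaging along a lasso solution path.
# Each path point gives one candidate model; predictions are then
# averaged with (placeholder) uniform weights.
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]              # sparse ground truth
y = X @ beta + rng.normal(0, 0.5, n)

# Fit the whole regularization path in one call.
alphas, coefs, _ = lasso_path(X, y, n_alphas=10)   # coefs: (p, n_alphas)
preds = X @ coefs                                  # one column per path model
weights = np.full(coefs.shape[1], 1.0 / coefs.shape[1])
averaged = preds @ weights                         # averaged prediction

print(f"averaged-model in-sample MSE: {np.mean((y - averaged) ** 2):.3f}")
```

In practice the weights would be chosen from the data (the whole point of model averaging); uniform weights merely show the mechanics of combining the nested fits on the path.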